Delaware Bay
Projecting U.S. coastal storm surge risks and impacts with deep learning
Rice, Julian R., Balaguru, Karthik, Rollano, Fadia Ticona, Wilson, John, Daniel, Brent, Judi, David, Sun, Ning, Leung, L. Ruby
Storm surge is one of the deadliest hazards posed by tropical cyclones (TCs), yet assessing its current and future risk is difficult due to the phenomenon's rarity and physical complexity. Recent advances in artificial intelligence applications to natural hazard modeling suggest a new avenue for addressing this problem. We utilize a deep learning storm surge model to efficiently estimate coastal surge risk in the United States from 900,000 synthetic TC events, accounting for projected changes in TC behavior and sea levels. The derived historical 100-year surge (the event with a 1% yearly exceedance probability) agrees well with historical observations and other modeling techniques. When coupled with an inundation model, we find that heightened TC intensities and sea levels by the end of the century result in a 50% increase in population at risk. Key findings include markedly heightened risk in Florida, and critical thresholds identified in Georgia and South Carolina.
- North America > United States > South Carolina (0.25)
- North America > United States > Virginia (0.14)
- North America > United States > Maryland (0.14)
- (16 more...)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
- Government > Regional Government > North America Government > United States Government (1.00)
- Health & Medicine (0.93)
- Energy > Renewable (0.67)
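The storm surge entry above defines the 100-year surge as the event with a 1% yearly exceedance probability. The relation between that probability, the return period, and the chance of seeing at least one such event over a planning horizon can be sketched as follows. This is a minimal illustration that assumes independent years with a constant exceedance probability; the function names are hypothetical and are not taken from the paper.

```python
def return_period(annual_exceedance_prob: float) -> float:
    """Average recurrence interval in years for a yearly exceedance probability."""
    if not 0.0 < annual_exceedance_prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / annual_exceedance_prob


def prob_at_least_one(annual_exceedance_prob: float, years: int) -> float:
    """Chance of one or more exceedances within `years`.

    Assumption: each year is independent and has the same probability.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return 1.0 - (1.0 - annual_exceedance_prob) ** years


if __name__ == "__main__":
    p = 0.01  # the 1% yearly exceedance probability quoted in the abstract
    print(f"return period: {return_period(p):.0f} years")
    for horizon in (30, 100):
        chance = prob_at_least_one(p, horizon)
        print(f"chance of at least one exceedance in {horizon} years: {chance:.3f}")
```

Under these assumptions, a "100-year" event is not guaranteed to occur once per century; the chance of at least one occurrence in 100 years is about 63%.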
Is AI currently capable of identifying wild oysters? A comparison of human annotators against the AI model, ODYSSEE
Campbell, Brendan, Williams, Alan, Baxevani, Kleio, Campbell, Alyssa, Dhoke, Rushabh, Hudock, Rileigh E., Lin, Xiaomin, Mange, Vivek, Neuberger, Bernhard, Suresh, Arjun, Vera, Alhim, Trembanis, Arthur, Tanner, Herbert G., Hale, Edward
Oysters are ecologically and commercially important species that require frequent monitoring to track population demographics (e.g. abundance, growth, mortality). Current methods of monitoring oyster reefs often require destructive sampling methods and extensive manual effort. Therefore, they are suboptimal for small-scale or sensitive environments. A recent alternative, the ODYSSEE model, was developed to use deep learning techniques to identify live oysters using video or images taken in the field of oyster reefs to assess abundance. The validity of this model in identifying live oysters on a reef was compared to expert and non-expert annotators. In addition, we identified potential sources of prediction error. Although the model can make inferences significantly faster than expert and non-expert annotators (39.6 s, 2.34 ± 0.61 h, 4.50 ± 1.46 h, respectively), the model overpredicted the number of live oysters, achieving lower accuracy (63%) in identifying live oysters compared to experts (74%) and non-experts (75%) alike. Image quality was an important factor in determining the accuracy of the model and the annotators. Better quality images improved human accuracy and worsened model accuracy. Although ODYSSEE was not sufficiently accurate, we anticipate that future training on higher-quality images, utilizing additional live imagery, and incorporating additional annotation training classes will greatly improve the model's predictive power based on the results of this analysis. Future research should address methods that improve the detection of living vs. dead oysters.
- North America > United States > Virginia (0.14)
- North America > United States > New Jersey (0.14)
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- (13 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Food & Agriculture > Fishing (0.94)
- Food & Agriculture > Agriculture (0.68)
- Health & Medicine (0.68)
SeafloorAI: A Large-scale Vision-Language Dataset for Seafloor Geological Survey
Nguyen, Kien X., Qiao, Fengchun, Trembanis, Arthur, Peng, Xi
A major obstacle to the advancement of machine learning models in marine science, particularly in sonar imagery analysis, is the scarcity of AI-ready datasets. While there have been efforts to make AI-ready sonar image datasets publicly available, they suffer from limitations in terms of environment setting and scale. To bridge this gap, we introduce SeafloorAI, the first extensive AI-ready dataset for seafloor mapping across 5 geological layers, curated in collaboration with marine scientists. We further extend the dataset to SeafloorGenAI by incorporating a language component to facilitate the development of both vision- and language-capable machine learning models for sonar imagery. The dataset consists of 62 geo-distributed data surveys spanning 17,300 square kilometers, with 696K sonar images, 827K annotated segmentation masks, 696K detailed language descriptions, and approximately 7M question-answer pairs. By making our data processing source code publicly available, we aim to engage the marine science community to enrich the data pool and inspire the machine learning community to develop more robust models. This collaborative approach will enhance the capabilities and applications of our datasets within both fields.
- North America > United States > Massachusetts (0.04)
- North America > United States > California (0.04)
- North America > United States > South Carolina (0.04)
- (12 more...)
- Energy (0.69)
- Health & Medicine > Diagnostic Medicine > Imaging (0.67)
- Information Technology (0.48)
- Government > Regional Government > North America Government > United States Government (0.47)
Word2Wave: Language Driven Mission Programming for Efficient Subsea Deployments of Marine Robots
Chen, Ruo, Blow, David, Abdullah, Adnan, Islam, Md Jahidul
This paper explores the design and development of a language-based interface for dynamic mission programming of autonomous underwater vehicles (AUVs). The proposed 'Word2Wave' (W2W) framework enables interactive programming and parameter configuration of AUVs for remote subsea missions. The W2W framework includes: (i) a set of novel language rules and command structures for efficient language-to-mission mapping; (ii) a GPT-based prompt engineering module for training data generation; (iii) a small language model (SLM)-based sequence-to-sequence learning pipeline for mission command generation from human speech or text; and (iv) a novel user interface for 2D mission map visualization and human-machine interfacing. The proposed learning pipeline adapts an SLM named T5-Small that can learn language-to-mission mapping from processed language data effectively, providing robust and efficient performance. In addition to a benchmark evaluation against state-of-the-art methods, we conduct a user interaction study to demonstrate the effectiveness of W2W over commercial AUV programming interfaces. Across participants, W2W-based programming required less than 10% of the time needed for mission programming with traditional interfaces; it is deemed to be a simpler and more natural paradigm for subsea mission programming, with a usability score of 76.25. W2W opens up promising future research opportunities on hands-free AUV mission programming for efficient subsea deployments.
- North America > United States > New Jersey (0.04)
- North America > United States > Delaware (0.04)
- Atlantic Ocean > North Atlantic Ocean > Delaware Bay (0.04)
- (5 more...)
ODYSSEE: Oyster Detection Yielded by Sensor Systems on Edge Electronics
Lin, Xiaomin, Mange, Vivek, Suresh, Arjun, Neuberger, Bernhard, Palnitkar, Aadi, Campbell, Brendan, Williams, Alan, Baxevani, Kleio, Mallette, Jeremy, Vera, Alhim, Vincze, Markus, Rekleitis, Ioannis, Tanner, Herbert G., Aloimonos, Yiannis
Oysters are a vital keystone species in coastal ecosystems, providing significant economic, environmental, and cultural benefits. As the importance of oysters grows, so does the relevance of autonomous systems for their detection and monitoring. However, current monitoring strategies often rely on destructive methods. While manual identification of oysters from video footage is non-destructive, it is time-consuming, requires expert input, and is further complicated by the challenges of the underwater environment. To address these challenges, we propose a novel pipeline using stable diffusion to augment a collected real dataset with realistic synthetic data. This method enhances the dataset used to train a YOLOv10-based vision model. The model is then deployed and tested on an edge platform in underwater robotics, achieving a state-of-the-art 0.657 mAP@50 for oyster detection on the Aqua2 platform.
- North America > United States > New Jersey (0.14)
- North America > United States > Texas (0.14)
- North America > United States > Delaware > New Castle County > Newark (0.14)
- (10 more...)
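The ODYSSEE entry above reports detection quality as mAP@50, that is, mean average precision where a predicted box counts as a match when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. The matching criterion can be sketched as below. This is a minimal illustration, not the paper's evaluation code; the box layout `(x_min, y_min, x_max, y_max)` and the function names are assumptions made for the example.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, truth: Box) -> bool:
    """True when a prediction overlaps a ground-truth box with IoU >= 0.5."""
    return iou(pred, truth) >= 0.5
```

A full mAP@50 computation would additionally rank predictions by confidence, build a precision-recall curve per class, and average the resulting areas; only the IoU test is shown here.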
LLMs can learn self-restraint through iterative self-reflection
Piché, Alexandre, Milios, Aristides, Bahdanau, Dzmitry, Pal, Chris
In order to be deployed safely, Large Language Models (LLMs) must be capable of dynamically adapting their behavior based on their level of knowledge and uncertainty associated with specific topics. This adaptive behavior, which we refer to as self-restraint, is non-trivial to teach since it depends on the internal knowledge of an LLM. By default, LLMs are trained to maximize the next token likelihood, which does not teach the model to modulate its answer based on its level of uncertainty. In order to learn self-restraint, we devise a utility function that can encourage the model to produce responses only when it is confident in them. This utility function can be used to score generations of different lengths as well as abstention. To optimize this function, we introduce ReSearch, a process of "self-reflection" consisting of iterative self-prompting and self-evaluation. We use the ReSearch algorithm to generate synthetic data on which we finetune our models. Compared to their original versions, our resulting models generate fewer hallucinations overall at no additional inference cost, for both known and unknown topics, as the model learns to selectively restrain itself. In addition, our method elegantly incorporates the ability to abstain by augmenting the samples generated by the model during the search procedure with an answer expressing abstention.
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States > New Jersey (0.06)
- North America > United States > Delaware (0.06)
- (5 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Government > Military (0.68)
RASPNet: A Benchmark Dataset for Radar Adaptive Signal Processing Applications
Venkatasubramanian, Shyam, Kang, Bosung, Pezeshki, Ali, Rangaswamy, Muralidhar, Tarokh, Vahid
This work presents a large-scale dataset for radar adaptive signal processing (RASP) applications, aimed at supporting the development of data-driven models within the radar community. The dataset, called RASPNet, consists of 100 realistic scenarios compiled over a variety of topographies and land types from across the contiguous United States, designed to reflect a diverse array of real-world environments. Within each scenario, RASPNet consists of 10,000 clutter realizations from an airborne radar setting, which can be utilized for radar algorithm development and evaluation. RASPNet intends to fill a prominent gap in the availability of a large-scale, realistic dataset that standardizes the evaluation of adaptive radar processing techniques. We describe its construction, organization, and several potential applications, which include a transfer learning example to demonstrate how RASPNet can be leveraged for realistic adaptive radar processing scenarios.
- North America > United States > Utah (0.46)
- North America > United States > Montana (0.28)
- North America > United States > Idaho (0.28)
- (49 more...)
- Energy (0.67)
- Government > Military (0.46)
- Government > Regional Government > North America Government > United States Government (0.45)
Learning Representations and Agents for Information Retrieval
A goal shared by artificial intelligence and information retrieval is to create an oracle, that is, a machine that can answer our questions, no matter how difficult they are. A more limited, but still instrumental, version of this oracle is a question-answering system, in which an open-ended question is given to the machine, and an answer is produced based on the knowledge it has access to. Such systems already exist and are increasingly capable of answering complicated questions. This progress can be partially attributed to the recent success of machine learning and to the efficient methods for storing and retrieving information, most notably through web search engines. One can imagine that this general-purpose question-answering system can be built as a billion-parameter neural network trained end-to-end with a large number of pairs of questions and answers. We argue, however, that although this approach has been very successful for tasks such as machine translation, storing the world's knowledge as parameters of a learning machine can be very hard. A more efficient way is to train an artificial agent on how to use an external retrieval system to collect relevant information. This agent can leverage the effort that has been put into designing and running efficient storage and retrieval systems by learning how to best utilize them to accomplish a task. ...
- North America > United States > New Jersey (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
- (13 more...)
- Leisure & Entertainment > Games (1.00)
- Media (0.93)
- Health & Medicine (0.93)
- Education (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
CoQA: A Conversational Question Answering Challenge
Reddy, Siva, Chen, Danqi, Manning, Christopher D.
Humans gather information by engaging in conversations involving a series of interconnected questions and answers. For machines to assist in information gathering, it is therefore essential to enable them to answer conversational questions. We introduce CoQA, a novel dataset for building Conversational Question Answering systems. Our dataset contains 127k questions with answers, obtained from 8k conversations about text passages from seven diverse domains. The questions are conversational, and the answers are free-form text with their corresponding evidence highlighted in the passage. We analyze CoQA in depth and show that conversational questions have challenging phenomena not present in existing reading comprehension datasets, e.g., coreference and pragmatic reasoning. We evaluate strong conversational and reading comprehension models on CoQA. The best system obtains an F1 score of 65.1%, which is 23.7 points behind human performance (88.8%), indicating there is ample room for improvement. We launch CoQA as a challenge to the community at http://stanfordnlp.github.io/coqa/
- North America > United States > New Jersey (0.14)
- North America > United States > Virginia (0.05)
- North America > United States > New York (0.05)
- (7 more...)
- Government > Regional Government (0.68)
- Government > Voting & Elections (0.68)
- Education > Curriculum > Subject-Specific Education (0.67)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.55)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
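The CoQA entry above reports system and human performance as F1 scores (65.1% versus 88.8%, a gap of 88.8 - 65.1 = 23.7 points). For free-form answers, F1 is commonly computed over the tokens shared between a predicted answer and a reference answer. The sketch below illustrates that idea under simplifying assumptions: it lowercases and splits on whitespace only, and the function name `token_f1` is hypothetical. It is not the official CoQA evaluation script, which may normalize answers differently and aggregate over multiple references.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either answer is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, a prediction of "the white cat" against a reference of "white cat" has precision 2/3 and recall 1, so the extra word is penalized without zeroing the score.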